I liked the Financial Times plots for tracking the evolution of COVID-19 (https://www.ft.com/coronavirus-latest), but then they switched to different plots. So here I am more or less reproducing the original ones (and adding some others). This is generated from an Rmarkdown document that I’ll be re-rendering daily.
For now I'm using the country data from https://covid.ourworldindata.org:
# Cumulative confirmed cases by country
cases = read.csv("https://covid.ourworldindata.org/data/ecdc/total_cases.csv",
                 stringsAsFactors=FALSE)
cases$date = as.POSIXct(cases$date)
# Whole days elapsed since 2019-12-31 (rounded to absorb daylight-saving shifts)
cases$doy = round(as.numeric(difftime(cases$date, as.POSIXct("2019-12-31"), units="days")))

# Cumulative deaths by country, handled the same way
deaths = read.csv("https://covid.ourworldindata.org/data/ecdc/total_deaths.csv",
                  stringsAsFactors=FALSE)
deaths$date = as.POSIXct(deaths$date)
deaths$doy = round(as.numeric(difftime(deaths$date, as.POSIXct("2019-12-31"), units="days")))
Gray dashed lines are doubling times of 1, 2, 3, and 7 days (from steepest to shallowest)
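As a rough sketch of how reference lines like these can be drawn (not necessarily how the plots here are made): a count that doubles every d days grows as n0 * 2^(days/d), which is a straight line on a log-scaled y axis. The helper name `doubling.curve`, the starting count of 100, and the axis labels are assumptions for illustration.

```r
# Hypothetical helper: count after `days` days, starting from `n0`,
# if the count doubles every `doubling.time` days.
doubling.curve = function(days, doubling.time, n0=100) {
  n0 * 2^(days / doubling.time)
}

days = 0:30
doubling.times = c(1, 2, 3, 7)

# Set up an empty plot with a log-scaled y axis (type="n" draws no points)
plot(days, doubling.curve(days, 7), type="n", log="y", ylim=c(100, 1e6),
     xlab="Days since start (assumed axis)", ylab="Cumulative count")

# One gray dashed line per doubling time; shorter doubling times are steeper
for (d in doubling.times) {
  lines(days, doubling.curve(days, d), col="gray", lty=2)
}
```

On the log scale the slope of each line is log10(2)/d per day, which is why the 1-day line is the steepest and the 7-day line the shallowest.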
The raw daily data is quite noisy, so there are also smoothed versions.
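To illustrate what I mean by raw versus smoothed, here is a minimal sketch on made-up numbers. Daily counts are the day-to-day differences of the cumulative totals, and one simple way to smooth them is a centered running mean. The 7-day window and the use of `stats::filter` are assumptions for this example, not necessarily the smoother used for the plots.

```r
# Toy cumulative counts standing in for one country's column (not real data)
cumulative = c(0, 1, 1, 4, 9, 9, 20, 35, 36, 60)

# Daily new counts are the differences between consecutive cumulative totals
daily = c(cumulative[1], diff(cumulative))

# Centered running mean; the 7-day window is an assumed choice
window = 7
smoothed = as.numeric(stats::filter(daily, rep(1/window, window), sides=2))

# The first and last 3 values are NA because the window does not fit there
print(data.frame(daily, smoothed))
```

A centered window avoids shifting peaks later in time, at the cost of having no smoothed value for the most recent few days.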
By “naive” I mean: at any point in time, divide the total cumulative number of deaths by the total cumulative number of confirmed cases. This estimate of the case fatality rate (CFR) will be biased high because the denominator is too small: not all cases are detected (due to lack of testing). On the other hand, it will be biased low because some currently active cases will eventually result in deaths. It should eventually converge on the true CFR if testing becomes widespread.
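The calculation itself is just a ratio of two cumulative series. Here is a small sketch on made-up numbers; the function name `naive.cfr` and the toy vectors are hypothetical, and days with zero confirmed cases are left as NA rather than dividing by zero.

```r
# Toy cumulative totals for one hypothetical country (not real data)
toy.cases  = c(0, 10, 50, 200, 1000)
toy.deaths = c(0,  0,  1,   4,   30)

# Naive CFR: cumulative deaths divided by cumulative confirmed cases,
# left undefined (NA) while there are no confirmed cases
naive.cfr = function(deaths, cases) {
  ifelse(cases > 0, deaths / cases, NA)
}

cfr = naive.cfr(toy.deaths, toy.cases)
print(round(100 * cfr, 1))   # as a percentage
```

With the real data the same function could be applied to matching columns of `deaths` and `cases`, assuming both data frames cover the same dates in the same order.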
For US state-level data, so far what I’ve found is from the NY Times. It is pretty strangely structured (one row per state per day), but oh well:
states = read.csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv",
                  stringsAsFactors=FALSE)
states$date = as.POSIXct(states$date)
states$doy = round(as.numeric(difftime(states$date, as.POSIXct("2019-12-31"), units="days")))
# Get this into a more sensible structure (one row per day, one column per state):
us.deaths = t(tapply(states$deaths,
                     list(factor(states$state, levels=sort(unique(states$state))),
                          factor(states$doy, levels=min(states$doy):max(states$doy))),
                     identity))
us.deaths = as.data.frame(us.deaths)
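# Build the day index from the same full min:max range used for the factor
# levels, so the rows still line up if a day is missing from the file.
doys = min(states$doy):max(states$doy)
dates = seq(min(states$date), by="day", length.out=length(doys))
us.deaths$date = dates
us.deaths$doy = doys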
us.cases = t(tapply(states$cases,
                    list(factor(states$state, levels=sort(unique(states$state))),
                         factor(states$doy, levels=min(states$doy):max(states$doy))),
                    identity))
us.cases = as.data.frame(us.cases)
us.cases$date = dates
us.cases$doy = doys
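To make the reshaping step easier to follow, here is the same `tapply` pattern on a tiny made-up data frame. The `toy` data and the name `wide` are hypothetical. State-day combinations with no record come out as NA.

```r
# Toy long-format data in the same shape as the NY Times file (values made up)
toy = data.frame(state=c("A", "A", "B", "A", "B"),
                 doy=c(1, 2, 2, 3, 3),
                 deaths=c(0, 1, 0, 3, 2),
                 stringsAsFactors=FALSE)

# Cross-tabulate state by day, then transpose: one row per day, one column per state
wide = t(tapply(toy$deaths,
                list(factor(toy$state),
                     factor(toy$doy, levels=min(toy$doy):max(toy$doy))),
                identity))
wide = as.data.frame(wide)
wide$doy = min(toy$doy):max(toy$doy)

print(wide)   # state B has no record for day 1, so that cell is NA
```

Using the full `min:max` range as factor levels guarantees a row for every day, even one with no records at all.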
Gray dashed lines are doubling times of 1, 2, 3, and 7 days (from steepest to shallowest)
The raw daily data is quite noisy, so there are also smoothed versions.